On Sampling Type Distribution from Heterogeneous Social Networks
Abstract
Social network analysis has recently drawn the attention of many researchers. With the advance of communication technologies, the scale of social networks is growing rapidly. Graph sampling is an important approach for capturing the characteristics of very large social networks because it does not require visiting the entire network. Prior studies on graph sampling focused on preserving properties such as the degree distribution and clustering coefficient of a homogeneous graph, where all nodes and edges are treated equally. However, a node in a social network usually has an attribute of its own that indicates a specific group membership or type. For example, people are of different races or nationalities. A link between individuals can thus be classified as an intra-connection (both endpoints of the same type) or an inter-connection (endpoints of different types). It is therefore important to ask whether a sampling method can preserve the node and link type distributions of a heterogeneous social network. In this paper, we formally address this issue. Moreover, we apply five algorithms to real Twitter data sets to evaluate their performance. The results show that respondent-driven sampling works well even when the sample size is small, while random node sampling works best only with large sample sizes.
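As a rough illustration of the quantities the abstract refers to, the sketch below computes the node type distribution and the intra/inter link type distribution of a toy graph, draws a uniform random node sample, and compares the sample with the full graph. Everything here is an assumption made for illustration and is not taken from the paper: the toy data, the function names, the choice to keep only edges whose endpoints are both sampled (an induced subgraph), and the use of total variation distance as the comparison measure.

```python
import random
from collections import Counter


def node_type_distribution(node_types, nodes):
    """Fraction of nodes of each type among `nodes`."""
    counts = Counter(node_types[n] for n in nodes)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def link_type_distribution(node_types, edges):
    """Fraction of intra-type and inter-type links among `edges`."""
    counts = Counter(
        "intra" if node_types[u] == node_types[v] else "inter"
        for u, v in edges
    )
    total = sum(counts.values())
    return {k: counts[k] / total for k in ("intra", "inter")} if total else {}


def random_node_sample(node_types, edges, k, seed=0):
    """Pick k nodes uniformly at random; keep only edges among them.

    Using the induced subgraph is an assumption of this sketch.
    """
    rng = random.Random(seed)
    sampled = set(rng.sample(sorted(node_types), k))
    induced = [(u, v) for u, v in edges if u in sampled and v in sampled]
    return sampled, induced


def total_variation(p, q):
    """Total variation distance between two distributions given as dicts.

    Chosen here for illustration; the paper's own measure may differ.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)


# Hypothetical toy network: two node types, five intra links, one inter link.
node_types = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (2, 3)]

full_nodes = node_type_distribution(node_types, node_types)
full_links = link_type_distribution(node_types, edges)

sampled, induced = random_node_sample(node_types, edges, k=4)
sample_nodes = node_type_distribution(node_types, sampled)
sample_links = link_type_distribution(node_types, induced)

print("node type error:", total_variation(full_nodes, sample_nodes))
print("link type error:", total_variation(full_links, sample_links))
```

A crawl-based method such as respondent-driven sampling would replace `random_node_sample` while the two distribution functions and the comparison stay the same, which is what makes a side-by-side evaluation of several samplers straightforward.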
Similar resources
Weighted Random Walks for Meta-Path Expansion in Heterogeneous Networks
In social networks, users and items are joined in a complex web of relations, which can be modeled as heterogeneous information networks. Such networks include a variety of object types and the rich relations among them. Recent research has shown that a hybrid recommendation approach combining components built from extended meta-paths in the network can improve the accuracy of recommendations i...
Comparative Analysis of Information Dissemination Capabilities of Media and Social Networks
Background and Aim: Human knowledge depends on data and information that emerge and are transferred through different channels. The dissemination process differs in type, form of transfer, and distribution based on information or awareness. This survey compares the information-transfer capabilities of librarians and information scientists in mass media and social networks. Methods: This ...
Sampling from social networks’s graph based on topological properties and bee colony algorithm
In recent years, the sampling problem in massive social network graphs has attracted much attention, because a small but good sample can be analyzed quickly in place of a huge network. Many algorithms have been proposed for sampling social network graphs. The purpose of these algorithms is to create a sample that is approximately similar to the original network’s graph in terms of properties such as de...
Community Distribution Outlier Detection in Heterogeneous Information Networks
Heterogeneous networks are ubiquitous. For example, bibliographic data, social data, medical records, movie data and many more can be modeled as heterogeneous networks. Rich information associated with multi-typed nodes in heterogeneous networks motivates us to propose a new definition of outliers, which is different from those defined for homogeneous networks. In this paper, we propose the nov...
Unbiased Sampling over Online Social Networks
In recent years, Online Social Networks (OSNs) have evolved in many ways and attracted millions of users. The dramatic increase in the popularity of OSNs has encouraged network researchers to examine their connectivity structure. The majority of empirical studies characterizing OSN connectivity graphs have analyzed snapshots of the system taken at different times. These snapshots are co...